
fix #3587: adding inform support for limit/batch fetching #3753

Merged
merged 1 commit into fabric8io:master
Jan 20, 2022

Conversation

shawkins
Contributor

Description

Related to #3587 #3616 - adds limit / batch fetching support to informers. This can cut down on the size of individual responses from the server and, eventually, the memory footprint of an informer. To support the latter, this will not fall back to fetching the entire list the way the go client does - that may need some refinement, as currently it will just keep retrying.

If the limit is not specified, the default is to fetch everything in a single list.
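For reference, a minimal sketch of how a caller might opt in to batched fetching (assuming the `withLimit` option on the `Informable` DSL introduced by this change; the namespace, batch size, and resync period are illustrative):

```java
import io.fabric8.kubernetes.api.model.Pod;
import io.fabric8.kubernetes.client.DefaultKubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.informers.ResourceEventHandler;
import io.fabric8.kubernetes.client.informers.SharedIndexInformer;

public class LimitedInformerExample {
  public static void main(String[] args) {
    try (KubernetesClient client = new DefaultKubernetesClient()) {
      // Page the informer's list/relist calls 100 items at a time instead of
      // fetching everything in a single list response.
      SharedIndexInformer<Pod> informer = client.pods()
          .inNamespace("example")   // illustrative namespace
          .withLimit(100L)          // batch size; omit to keep the default single-list behavior
          .inform(new ResourceEventHandler<Pod>() {
            @Override
            public void onAdd(Pod pod) { /* handle add */ }

            @Override
            public void onUpdate(Pod oldPod, Pod newPod) { /* handle update */ }

            @Override
            public void onDelete(Pod pod, boolean deletedFinalStateUnknown) { /* handle delete */ }
          }, 60 * 1000L);           // resync period in ms
      // ... keep the client (and the running informer) open for the life of the controller
    }
  }
}
```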

A behavioral difference introduced here is that the effect of a relist is no longer atomic on the cache - rather, each item in the relist will perform a cache update and create an event as it is seen. Since the cache is still eventually consistent either way, I think this change is not a concern.

The javadocs call out a performance warning - it seems that pagination hits etcd directly, so there's a concern about performance and scaling.

These changes do conflict, of course, with the api module creation, but that will be easy to resolve.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • Feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Chore (non-breaking change which doesn't affect codebase;
    test, version modification, documentation, etc.)

Checklist

  • Code contributed by me aligns with current project license: Apache 2.0
  • I added a CHANGELOG entry regarding this change
  • I have implemented unit tests to cover my changes
  • I have added/updated the javadocs and other documentation accordingly
  • No new bugs, code smells, etc. in SonarCloud report
  • I tested my code in Kubernetes
  • I tested my code in OpenShift

@centos-ci

Can one of the admins verify this patch?

@shawkins
Contributor Author

@metacosm @attilapiros Some additional thoughts - it seems to be a pretty advanced case to consider using the limit option with an informer. There are quite a few comments around the behavior in the go client.

I don't see much in the literature about it being a built-in case, but it seems like you'd also look towards a sharding pattern - an overall/coordinating operator would create something like a statefulset of controllers, where each has responsibility for a subset of the resources - driven by an annotation (something that is filterable by informers/kubernetes) added by the overall operator.

@manusa
Member

manusa commented Jan 17, 2022

I don't see much in the literature about it being a built-in case, but it seems like you'd also look towards a sharding pattern - an overall/coordinating operator would create something like a statefulset of controllers, where each has responsibility for a subset of the resources - driven by an annotation (something that is filterable by informers/kubernetes) added by the overall operator.

I can't find the source now, but I recall that operators should be singletons (or work with some sort of leader-election mechanism). (Which is something of a disappointment given the nature of K8s and its overall purpose.)

Is this new sharding approach something documented?

I understand that this has nothing to do with the scope of the current PR, or does it?

@shawkins
Contributor Author

Is this new sharding approach something documented?

No, just a musing on how you would approach it assuming a "singleton" coordinating operator.

I understand that this has nothing to do with the scope of the current PR, or does it?

So far the only request for this limit/batching logic is to work towards scaling operators. If there are other applicable approaches, we should determine what is best to ultimately pursue. The next step from here is to control the memory footprint of the cache - from #3636 that looks like either a mapping/pruning function or a store that's not fully in memory. I don't think the latter seems like the right direction.

As for this change, it does seem like it can stand alone based upon the go client. There are situations where it is beneficial to fetch smaller, paginated results.

@shawkins
Contributor Author

No, just a musing on how you would approach it assuming a "singleton" coordinating operator.

To elaborate further - you'd have the coordinating operator set a hash on the resource as a label, in the range of the number of StatefulSet replicas. The StatefulSet pods would be set up to provide the pod name via an env property, which means they could parse out the pod index to determine which bucket they are responsible for. The individual informers would then be set up to only look for resources with that index label. Any dependent resources would need to have that label transitively set as well for the sharding pattern to work all the way down.
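A rough sketch of the worker side of that idea, assuming label-filtered informers and an env property (here called `POD_NAME`) exposed via the Downward API; the `shard-index` label key and the namespace are likewise illustrative, not part of this PR:

```java
import io.fabric8.kubernetes.api.model.Pod;
import io.fabric8.kubernetes.client.DefaultKubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.informers.ResourceEventHandler;

public class ShardedWorker {
  public static void main(String[] args) {
    // StatefulSet pod names end in an ordinal, e.g. "my-operator-2".
    String podName = System.getenv("POD_NAME");
    String ordinal = podName.substring(podName.lastIndexOf('-') + 1);

    try (KubernetesClient client = new DefaultKubernetesClient()) {
      // Only watch resources that the coordinating operator has labeled for this shard.
      client.pods()
          .inNamespace("example")
          .withLabel("shard-index", ordinal)   // label applied by the overall operator
          .inform(new ResourceEventHandler<Pod>() {
            @Override
            public void onAdd(Pod pod) { /* reconcile */ }

            @Override
            public void onUpdate(Pod oldPod, Pod newPod) { /* reconcile */ }

            @Override
            public void onDelete(Pod pod, boolean deletedFinalStateUnknown) { /* clean up */ }
          });
    }
  }
}
```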

@manusa
Member

manusa commented Jan 17, 2022

To elaborate further [...]

Yes, I figured some mechanism like this. However, I might still see some flaws depending on the use case (from my undetailed, first-glance perspective). For some other uses, it might not be necessary to have an orchestrating process/controller (e.g. with a StatefulSet, the Pod's ordinal name might be enough to devise a constant strategy to self-assign the bucket).

Anyway, I guess this is really off topic for the PR.

@shawkins
Contributor Author

the Pod's ordinal name might be enough to devise a constant strategy to self-assign the bucket

To self-assign, the hash/index label would already need to be present on the resources.

Anyway, I guess this is really off topic for the PR.

Yes, it was just to add some more to the discussion about operator scaling. I'll add a reference to this in the other ongoing issue.

@manusa
Member

manusa commented Jan 17, 2022

the hash/index label would already need to be present on the resources

Well, in my imaginary example the hash (or input for the resource assignment) could be computed from the resource UID or something similar.

@shawkins
Contributor Author

Well, in my imaginary example the hash (or input for the resource assignment) could be computed from the resource UID or something similar.

Something (which I'm referring to as the overall operator) has to hash each resource to an index and apply that index as a label to the resource - it needs to be filterable by an informer. I tried to summarize all the extra work this would entail at #3587 (comment)
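As a rough illustration of that coordinating side (hashing the UID is just one possible input, and the `shard-index` label key, namespace, resource type, and replica count are assumptions for the sketch):

```java
import io.fabric8.kubernetes.api.model.Pod;
import io.fabric8.kubernetes.api.model.PodBuilder;
import io.fabric8.kubernetes.client.DefaultKubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClient;

public class ShardAssigner {
  private static final int REPLICAS = 3;   // number of worker pods in the StatefulSet

  public static void main(String[] args) {
    try (KubernetesClient client = new DefaultKubernetesClient()) {
      for (Pod pod : client.pods().inNamespace("example").list().getItems()) {
        // Hash the stable UID into a shard index and record it as a filterable label.
        int shard = Math.floorMod(pod.getMetadata().getUid().hashCode(), REPLICAS);
        client.pods()
            .inNamespace("example")
            .withName(pod.getMetadata().getName())
            .edit(p -> new PodBuilder(p)
                .editMetadata()
                  .addToLabels("shard-index", String.valueOf(shard))
                .endMetadata()
                .build());
      }
    }
  }
}
```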

@manusa
Member

manusa commented Jan 18, 2022

Well, in my imaginary example the hash (or input for the resource assignment) could be computed from the resource UID or something similar.

Something (which I'm referring to as the overall operator) has to compute a hash to the index and apply it as a label to the resource - it needs to be filterable by an informer. I tried to summarize all the extra work this would entail at #3587 (comment)

😅 I forgot about the filtering part :)

From your detailed explanations I get the following:

  • The Orchestrating Pod (Overall Operator) is responsible for assigning a filterable property (in this case a label) to each resource to be processed (bottleneck).
  • 1-N worker Pods with shared informers that filter on the assigned resources and perform the actual work. I guess that scaling might impose a challenge, since the existing resources would need to be rehashed (I think you expand on this in your comment).

Support for some sort of server-side dynamic labels or wildcard filtering would be of help to completely remove the orchestrating part (kubernetes/kubernetes#30755)

@manusa manusa added this to the 5.12.0 milestone Jan 18, 2022
@sonarcloud

sonarcloud bot commented Jan 20, 2022

SonarCloud Quality Gate failed.

Bugs: 0 (A)
Vulnerabilities: 0 (A)
Security Hotspots: 0 (A)
Code Smells: 0 (A)
Coverage: 62.8%
Duplication: 0.0%

@manusa manusa merged commit 36d6731 into fabric8io:master Jan 20, 2022
@amydentremont

Just wanted to say thanks for adding this feature! My team just ran into needing the ListerWatcher to paginate in the Informer's Reflector and we realized the ability was something we just got from upgrading versions 🙌

@shawkins
Contributor Author

Just wanted to say thanks for adding this feature!

No problem, hope it works out well.
